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Abstract 

This paper builds on recent research that focuses on regression modeling of con- 
tinuous bounded data, such as proportions measured on a continuous scale. Specif- 
ically, it deals with beta regression models with mixed effects from a Bayesian ap- 
proach. We use a suitable parameterization of the beta law in terms of its mean and 
a precision parameter, and allow both parameters to be modeled through regres- 
sion structures that may involve fixed and random effects. Specification of prior 
distributions is discussed, computational implementation via Gibbs sampling is 
provided, and illustrative examples are presented. 

Keywords: Bayesian analysis; Beta distribution; Beta regression; Continuous 
proportions; Mixed models. 



1. Introduction 

Mixed-effects models have been widely employed in statistical analysis, main- 
ly in the area of health, and their study has been primarily restricted to response 
variables with normal or, at least, symmetrical distributions. Such models are 
not appropriately applicable when the response variable has a support restricted 
to a doubly bounded interval. Limited-range variables are, however, common 
in practice; for example, proportions are bounded between zero and one. This 
paper proposes a Bayesian analysis for mixed-effects regression models that are 
tailored for situations where the response variable is measured on a continuous 
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scale and is restricted to the unit interval (0, 1). Situations where the response, 
say y, is limited to a known interval (a, b) is also accommodated through the 
transformation y* = (y — a)/(b — a). The response variable is assumed to be 
beta distributed with mean (and possibly a precision parameter) modeled using 
fixed and random effects. The substantial advantage to consider a beta modeling 
is due to the flexibility that it provides. In fact, the beta family includes left or 
right skewed, symmetric, J-shaped, and inverted J-shaped distributions. 



Following Ferrari and Cribari-Neto (2004|), our proposed model uses a param- 



eterization of the beta law in terms of its mean and an additional positive parameter 
that can be regarded as a precision parameter. The mean of the response variable 
is conveniently linked with a mixed-effects regression structure by the logit link 
function. An extended version of such a model is also considered. It assumes 
that the precision parameter is not constant over the observations, but rather it is 
related to a mixed-effects function through a log link. 

To formulate our proposed models, we adopt a Bayesian approach. We address 
the issues of model fitting via Gibbs sampling, choice of prior distributions, and 
model selection based on the deviance information criterion, the expected Akaike 
information criterion and the expected Bayesian information criterion. Simulated 
and real data analysis are presented for illustration. An appendix presents various 
pieces of BUGS code used for fitting the mixed beta regression. 

2. Bayesian mixed beta regression 

Due to the flexibility of the beta distribution in terms of the variety of density 
shapes that can be accommodated, this distribution is a natural choice for model- 
ing continuous data that are restricted to the interval (0,1). The probability density 
function of a variable y following a beta distribution parameterized in terms of its 
mean ji (0 < /x < 1) and a precision parameter (0 > 0) is given by 

where T(-) denotes the gamma function. Note that <p can be interpreted as a pre- 
cision parameter, since fi = E(y) and Var(y) = — /i)/(l + <p) and, hence, for 
each fixed value of the mean fi, 1 + (j> is inversely proportional to the variance of 
y. If y has density function ([I]), we write y ~ beta(/i0, (1 — fi)(j)). 

Now, let 2/1, • • • , y n be n independent random variables such that 
yi ~ beta(/i0, (1 — /i)0). The definition of a beta regression model requires a 
transformation of the mean iij of yi, i = 1, . . . , n, that maps the interval (0, 1) 



2 



onto the real line. A convenient and popular link function is the logit link. It 
is then assumed that ln{/Xj/(l — /x^)} = xJ/3, where X{ is a vector of known 
covariates for the z-th subject and (3 denotes a vector of regression coefficients. 
The first element of Xi is usually taken as 1 to allow for an intercept. 

The precision parameter <p may be assumed to be constant over observations 
(Fer rari and Cribari-Neto[ 12004) or it may be modeled in terms of a regression 
structure ( Smithson and Verkuilen, 2006). Since the precision parameter is strictly 
positive, the log link function is a natural choice. It is then assumed that ln(0i) = 
wj5 where Wi is a vector of covariates and 5 denotes a vector of unknown re- 
gression coefficients. Again, it is convenient to take the first element of u>; as 1 
to allow for an intercept in the precision description. There is no restriction on 
whether or not the W{S contain the same predictor variables as XiS. 

The beta regression model described above does not involve random effects. 



Extending previous works on Bayesian generalized linear models (Dey et al. 



2000) and Bayesian beta regression (Bra nscum et al.[|2"007| ), we define below two 



mixed beta regression models, the first of which assumes that the precision param- 
eter is the same for all the observations, and the second involves a mixed-effects 
model for the precision parameter. 

Let yi, . . . , y m be independent continuous random vectors, where 
Vi — {Dili ■ ■ ■ iViriiY represents an observed response vector for a sample unit 
i and for which each of its components, y^, takes values on the interval (0, 1). 
Consider also a regression model with the following structure: 

G(E( yi \bi)) = Xtf + Zfo, (2) 

i = 1, . . . , m, where £?(•) is a vector- function linking the conditional mean re- 
sponse vector E(yi\bi) with the linear mixed model r/j = Xi/3 + ZJbi, for which 
Xi is the design matrix of dimension rij x p corresponding to the vector (3 = 
(/?!,..., (3pY of regression coefficients (the fixed effects) and Zi is the design 
matrix of dimension x q associated with the vector 6j = (bn, . . . , b iq ) T (the 
random effects). 

For the logit link function, the j-th component of ([2]) is 

In 1 1 ^ | = Vij = x Jjl3 + z Jj b i> 

where /% = E(y ij \b i ), x^ = (x^i, . . . , x ijp ) J , and % = . . . , z ijq ) J , which 
is equivalent to 

= expfaj) = exp(sT.ff + zj-bj) 
^ 1 + exp(rfo) 1 + exp(xJjP + zjfii) ' 
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In this work, we first assume that for i = 1, 2, . . . , m and j = 1, 2, . . . , Hi 

yij\bi, (3, 4> 4 ~' beta(/%0, (1 - fiij)<f)), 

i.e., conditionally on bi, f3, and <fi, the y^'s are independent and have probability 
density function given by ([T|), with fi replaced by fj,^, which is specified by Q. 
Note that in this formulation, <p represents a common precision parameter. 

In mixed models, the random effects b±, . . . , b m are typically assumed to be 

independent and normally distributed, namely 6j|E& l ~'iV 9 (0, E 6 ), i = 1, . . . , m, 
where E b is a positive-definite matrix. The normality assumption, however, can 
be inappropriate in practical applications where the measurements present out- 
liers. In these cases, it is more adequate to consider multivariate distributions 
with heavier- than-normal tails for the random effects. Consequently, a multivari- 
ate t-distribution with v b > degrees of freedom, location vector /ij = 6 1 ? 
and positive-definite dispersion matrix E fe is a better candidate to model the ran- 
dom effects 6j's, i.e., bi\u b , E 6 4 ~'£ g (z/ fc , 0, E b ), i — 1, . . . , m. It should be noticed 
here that for large values of v\, the multivariate t-distribution is approximately a 
multivariate normal distribution. 

In the mixed beta regression model proposed above, the precision parameter 

is constant over the observations. For a more general formulation of this model, 
we consider a different precision parameter, say <fiij, for each response y^. We 
then assume a mixed linear model for the logarithm of 0^-, namely 

lnOij) = Tij = wjj5 + hjjdi, (4) 

where wjj = (Wyi , . . . , Wij P * ) is the design vector corresponding to the p* x 1 vec- 
tor 8 of fixed effects and hjj = (Ziyi, . . . , hij q * ) is the design vector corresponding 
to the q* x 1 vector di of random effects. Note that the design matrices Wi = 
(wn, . . . , w irii ) T and Hi = (ha, . . . , h in .) T may, but are not required to, con- 
tain the same predictor variables as the matrices Xi = (xn, . . . , x in J T and Z { = 

(zn, . . . , z in .) T , respectively. Here, it may be assumed that <ij|E/~W g (0, E d ), 

1 = 1, . . . , m, where E d is a positive-definite matrix. Alternatively, we may as- 
sume that di\v d , Y,d~'t q (v d , 0, E d ), i = 1, . . . ,m. 

In order to complete the Bayesian specification of the beta mixed models de- 
scribed above, elicitation of prior distributions for all unknown parameters is re- 
quired. Multivariate normal prior distributions are typically considered for the 
fixed effects, i.e., (3 ~ N p (fj,p, E^). Vague priors are usually specified by taking 
large values for the prior variances. However, the impact of the scale choice under 
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the normal model cannot be neglected. An alternative strategy is to consider a 
multivariate i-distribution, i.e., (3 ~ t p (vp, ftp, Hp) and to specify an appropriated 
value for vp, the degrees of freedom parameter. If the vector of random effects 
is assumed to follow a multivariate i-distributed, i.e., bi\f b , fx b , E b ~ t g (u b , 0, E b ), 



then the prior distribution for the degrees of freedom can be discrete as in Albert 



and Chib| ( |1993| ) and |Besag et al.| ( |1995j ), or continuous as in |Geweke1 ( |1993| ). We 



have chosen the latter alternative. More specifically, we consider an exponential 
prior distribution with mean 1/a for the degrees of freedom, say e(a). The prior 
distribution for the scale matrix of random effects E& is chosen, mainly for compu- 



tational simplicity, to be an inverted Wishart distribution as in Fong et al. (2010), 
i.e., E b ~ IW q {ip,c). An alternative prior distribution for E b is a constrained 
Wishart distribution ( |Everson and Morris} |2000). 



We now turn to the specification of prior distributions for the precision pa- 
rameter. As mentioned above, in this paper we study the following beta mixed 
regression models. 

Model 1: It considers the mixed regression model ([3]) for the location parameters 
fjLij and a common precision parameter <p for each observation y t j. In the Bayesian 
context, a natural choice for the prior distribution of the precision parameter would 
be an inverse gamma distribution. If a slightly informative prior is required, it can 



be assumed that ~ IG(e, e), with a small fixed positive value for e. Gelman 
(2006) suggests that the prior distribution = U 2 with U ~ U(Q,a) with large 



a (a = 50 for example) is less informative than an inverse gamma prior. Here, 
we propose a more flexible prior distribution for <p that includes Gelman's prior 
distribution as a special case. More specifically, we propose the following prior 
specification for 0: = (aB) 2 , where B ~ beta(l + e, 1 + e), for given positive 
values for a and e. 

Model 2: It considers the mixed regression model ([3]) for the location parameter 
Hij and a different precision parameter 0^- for each y^, where 0y is modeled as 
in ([4]). Here, the specification of prior distributions for S and the parameters of 
the distribution of the cfjS is similar to that used for (3 and the parameters of the 
distribution of the fe^s. 

3. Model fitting using Markov chain Monte Carlo sampling 



it n, , 



Let y T = (yj,..., y£) and r] T = ($,..., rj£), where r]J = (r] iU . . . ,77* 
Note that, by assumption, conditionally on (3, E b , and u b , the ?^s are independent 
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and have density function f(rji\(3, E 6 , v b ) oc f(bi\j3, E 6 , f b ), « = 1, . . . , m. We 
now present the following results for the joint posterior distribution under models 
1 and 2 described in the previous section. 

Under model 1, and the assumption that the parameters E 6 , v b , 0, and /3 are 
independent, the joint posterior density is 



f(P,Hb,Vb,<i>,ri\y) « 



m rii 



YlYlfivijhjA) 



=1 3=1 



X 



i=i 



f(Z b )f(v b )f(<f>)f((3). 



Gibbs sampling can be used to generate a Monte Carlo sample from the joint 
posterior density, f(/3, E 6 , 0, ?7|y). The Gibbs sampler in this context involves 
iteratively sampling from the full conditional distributions: 

f(E b \v b ,/3,(j),r},y), /(i/ 6 |E b , 0, 77, y), /(/3|E b , i/ b , 0, 77, y), 

f(cf)\f3,Y, b ,v b ,rj,y), and f(r)i\rj k , f3, E 6 , u b , 0, i, A; = 1, . . . , m, % ^ k, 

which can be implemented in the WinBUGS software. Posterior inferences on (5, 
Eft, and and, more importantly, on the mean responses (/i^; i = 1, . . . ,m,j = 
1, . . . , rii) are readily obtained in WinBUGS. Hypothesis testing regarding regres- 
sion coefficients and mean responses are also straightforward. 

We now turn to model 2. Let t t = (rj , . . . , rjj, where r^ 7 = (t^i, . . . , r ifli ). 
By assumption, conditionally on 5, Y, d , and v d , the TjS are independent and have 
density function /(rj|5, E^, u d ) oc S^, v d ), % = 1, . . . ,m. Assuming prior 

independence of E b , v b , E d , v d , 5, and we obtain the posterior density given by 



f(P,i: b ,v b ,5,Y: d ,v d ,ri,T\y) oc 



X 



nn / (,/ r 

.i=i j=i 

m 

Hf(rn\p,J: b ,u b ) 

i=l 



Y[f(Ti\S, E d , v d ) 



i=l 



x f(Z b )f(is b )f(5)f(Z d )f(is d )f((3). 

Similarly to model 1, the Gibbs sampling can be used to generate a Monte Carlo 
sample from f(/3, E b , v b , 5, E d , v d , 77, r\y). In this case, the Gibbs sampler involves 
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iteratively sampling from the following full conditional distributions: 

f(T, b \u b: ft 5, E d , v d , 77, r, y), f(v b \Y, b , ft 5, E d , i/ d , 77, r, y), 

/(/3|E 6 , i/ 6 , 5, E d , 77, r, 7/), /(5|ft E b , z/ 6 , E d , u d , rj, r, y), 

/(E d |ft E b , z/ b , <5, z/ d , 7/, r, y), E b , z/ b , 5, E d , 77, r, y), 

fiVilVk, r k , ft E b , z/ fe , 5, E d , z/ d , yft and f{n\T k ,r] k , ft E b , z/ 6 , 5, E d , i/ d , y;), 

for i, k = 1, . . . , m, and i ^ k, which can be also implemented in the Win- 
BUGS software. Thus, posterior inferences on ft E b , and 0^, for i = 1, . . . , m, 
j = 1, . . . , rii, and on the mean responses /i^, for i = 1, . . . , m, j = 1, . . . , rii 
are easily obtained in WinBUGS. Again, hypothesis testing regarding regression 
coefficients and mean responses are also straightforward. 



4. Illustration via simulations 

To illustrate the proposed methodology, we consider the following mixed beta 
regression model with simulated data (model 1): 

yijlk, <f>,0 ~ beta(/jjj0, (1 - 

where /3 = (ft, f3 2 , ft) T , h = (b a , b i2 ) T , 




= Vij = (Pi + hi) + (P2 + b i2 )xi j2 + (3 3 x ij3 , 



i = 1, . . . , 100, j = 1, . . . , 5, and bi\v b ,lu b ~ t 2 {y bl 0, E b ). For our simulation 
study, the values of the covariates were generated from a uniform distribution in 
the unit interval, and we set v b = 10, = 49, f3 = (—2, 1, 2) T , and 




As proposed in Section 2, we adopt the following prior specifications: v h ~ 

e(a), E 6 ~ IW 2 (*,c), and f3 = (ft, ft, ft) T ~ h{v p ,n P ,^) with a = 0.1, 
c = 5, 



*=( 2 ° 2 ° ). ^ =10. ,,,, = (0.0.0) . E,= j Hi 



10 
10 
10 
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We first analyze a single simulated dataset under different prior specifications 
for the precision parameter. The following prior distributions for <p were consid- 
ered: (i) <p ~ IG(e, e), with e = 0.001 (model la); (ii) <p = U 2 , with U ~ £7(0, 50) 
(model lb); (iii) = (505) 2 where B ~ beta(l + e, 1 + e), with e = 0.1 (model 
lc) and 0.5 (model Id); (iv) ln(</>) ~ t(vp, ftp, aV) with vp = 10, /ip = 0, cr 2 = 10 
(model le). Note that (iii) corresponds to our proposal (see Section 2). A sen- 
sitivity analysis for prior specification of the precision parameter can be carried 
out from the figures in Table 1 , which reports the deviance information criterion 
(DIC) proposed by Spiegelhalte r et al.| ( |2002[ ), the expected Akaike information 
criterion (EAIC) introduced by Brooks] (2002), and the expected Bayesian infor- 
mation criterion (EBIC) given in Carlin and Louis (2001) for the fitted models 
with different prior distributions for (p. We observe that the different proposed pri- 
ors lead to similar DICs, EAICs and EAICs. However, the three criteria indicate 
that model Id shows a slightly better fit than the other proposals. 



Table 1 : DIC, EAIC and EBIC for the fitted models with different prior specifications for the 
precision parameter under model 1; simulated dataset 



Model 


Prior for 


DIC 


EAIC 


EBIC 


model la 


<j) ~ IG(0.01, 0.01) 


-1279.02 


-1411.84 


-1378.12 


model lb 


</> = U 2 ,U~ [7(0,50) 


-1279.09 


-1411.92 


-1378.2 


model lc 


<p= (50B) 2 ,B~beta(l.l,l.l) 


-1279.26 


-1412.10 


-1378.38 


model Id 


<P= (505) 2 , J B~beta(1.5, 1.5) 


-1279.84 


-1412.69 


-1378.98 


model le 


ln(0) ~ t(10,0,5) 


-1275.52 


-1406.92 


-1373.20 



In Table 2, we report the parameter estimates for model Id. These results show 
that the estimated parameters from the Bayesian methodology proposed here are 
similar to the true values of the model parameters. In our simulation study, we 
consider 100,000 Monte Carlo iterations and the results are presented consider- 
ing the last 90,000 iterations. In addition, the necessary diagnostic tests (such as 
convergence, autocorrelation, history) were performed, from which desirable be- 
haviors were observed in the chains (for brevity detailed numerical results are not 
shown but are commented below). We also conducted a sensitivity analysis with 
respect to the prior specifications of the regression coefficients and the dispersion 
matrix of the random effects coefficients. In each case, the posterior inferences 
were not appreciably altered in comparison with the results presented in Table 2. 
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Table 2: True mean and estimated posterior medians and means, 95% credibility intervals ( CI) 
for model Id; simulated dataset 



Parameter 






Posterior Inference 




True 


Mean 


MC Error 


Median 


95% CI 


Pi 


-2.000 


-2.094 


0.003 


-2.094 


(-2.329, -1.865) 


Pi 


1.000 


1.074 


0.001 


1.075 


(0.906, 1.241) 


Ps 


2.000 


2.000 


0.000 


2.000 


(1.873, 2.126) 


<t> 


49.000 


49.280 


0.038 


49.150 


(41.350, 57.800) 


Wb 


10.000 


7.086 


0.058 


5.338 


(2.223,23.100) 




1.000 


0.883 


0.002 


0.867 


(0.490, 1.369) 


S &12 


-0.300 


-0.182 


0.000 


-0.173 


(-0.393, -0.024) 


^622 


0.200 


0.242 


0.001 


0.231 


(0.082, 0.466) 



The multivariate version of Gelman and Rubin's convergence diagnostic pro- 
posed by Brooks and Gelman ( 1998 1 indicates that the chain is convergent since 
the multivariate proportional scale reduction factor (mprf) equals 1.01. Also, for 
each parameter, we checked that the convergence is achieved for each chain. The 
latter conclusion is corroborated by three different convergence tests, namely Gel- 
man and Rubin's convergence diagnostic ( |Gelman and Rubin] |1992| ), Geweke's 
diagnostic ( |Geweke[|1992[ ), and Heidelberg and Welch's diagnostic (Heidelberger 



and Welch] < \ 1 98 1[ ) and Heidelberger and Welch ( 1983 )), which were obtained us- 



ing the libraries lattice and coda (Plummer et al. 2006| ) of the R sofware 
(freely available from http : // www.r-proj ect.or g/ ) . To obtain Gelman and Rubin's 
convergence diagnostic, we started two chains in different initial points and per- 
formed 100, 000 Monte Carlo iterations, considering the last 90, 000 iterations. In 
addition, history and autocorrelation plots (not shown) suggest that the chain for 
each parameter is stationary and not correlated, respectively. These results are 
essential to achieve an adequate estimation of the parameters. 

We now turn to a simulation study in which we investigate the convenience 
of assuming a multivariate t distribution for the random effects. We consider 
different values for u b (y h = 5, 10 and 50). For each value of v b , we generate 
N = 100 datasets from the mixed beta regression model Id (see above; the same 
values for the parameters and sample size are used). For each sample we fit the 
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model under the assumption of multivariate t and multiavariate normal distributed 
random effects. We compute the bias and the root-mean-square error (VMSE) for 
each parameter estimator over the N samples under the different settings. Table 3 
presents summary results for the estimation of all the parameters. Also, for each 
sample, we compute the information criteria DIC, EAIC and EBIC for both fits. 
Table 4 presents the mean DIC, EAIC and EBIC over the simulated samples. 

Overall, figures in Table 3 suggest that, when the data are heavy-tailed dis- 
tributed (say Vb = 5, 10), the performance of the posterior estimates obtained 
from the fit of the model that assumes a multivariate t distribution for the random 
effects is better than that of the posterior estimates taken from the normal fit. From 
Table 4, advantage of the multivariate t specification for the random effects over 
the normal specification is clear, more so when u b is small. 

Table 3: Summary results based on 100 simulated datasets; t and normal fits 



fl> Fit Posterior Inference 









ft 


ft 


ft 




Xb 12 


^6 22 


<t> 




5 


t 


Bias 


0.006 


-0.031 


0.003 


-0.108 


0.140 


0.119 


0.294 


2.209 






Vmse 


0.104 


0.078 


0.062 


0.251 


0.177 


0.161 


3.900 


4.528 




normal 


Bias 


0.019 


-0.036 


0.003 


0.386 


0.048 


0.292 


-0.460 








VMSE 


0.120 


0.088 


0.063 


0.477 


0.183 


0.329 


4.031 




10 


t 


Bias 


0.015 


-0,016 


-0.005 


-0.134 


0.128 


0.092 


-1,281 


0.074 






Vmse 


0.112 


0.092 


0.064 


0.229 


0.161 


0.144 


4.285 


3.768 




normal 


Bias 


0.016 


-0.016 


-0.005 


0.172 


0.071 


0.192 


-1,609 








VMSE 


0.114 


0.092 


0.065 


0.276 


0.146 


0.241 


4.480 




50 


t 


Bias 


0.020 


-0.018 


-0.005 


-0.163 


0.145 


0.065 


-2.062 


-17.930 






VMSE 


0.114 


0.085 


0.061 


0.235 


0.173 


0.135 


4.814 


20.379 




normal 


Bias 


0.018 


-0.018 


-0.007 


-0.045 


0.124 


0.103 


-2.121 








Vmse 


0.116 


0.085 


0.061 


0.181 


0.162 


0.172 


4.917 





We now use the same set of simulated dataset as in the beginning of this section 
to fit model 2, with five different regression structures for the precision parame- 
ter. Note that the true (unknown) model is a mixed beta regression model with 
constant precision, and hence only model 2a corresponds to the true model. Prior 
distributions for the parameters u b , S b and f3 are the same as those proposed for 
model 1. Also, for model specifications that include random effects for the pre- 
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Table 4: Mean DIC, EAIC and EBIC based on 100 simulated datasets; t and normal fits 





Fit 


DIC 


EAIC 


EBIC 


5 


t 


-1319.86 


-1456.27 


-1422.55 




normal 


-1313.76 


-1450.40 


-1416.68 


10 




-1276.30 


-1408.81 


-1375.09 




normal 


-1274.88 


-1406.58 


-1372.86 


50 


t 


-1249.76 


-1375.84 


-1342.12 




normal 


-1249.91 


-1375.59 


-1341.87 



cision parameter, we assume that the precision random effects di have the same 
distribution as the location random effects namely t(u b , 0, £&). Table 5 reports 
the DIC, EAIC and EBIC for the five fitted models using simulated data. We ob- 
serve that the submodels for the precision parameter that do not include random 
effects achieve the best fits to our data, models 2a, 2c and 2d being similarly good. 
However, model 2a, the model under which the data were simulated and which is 
equivalent to model le, provides a better fit than the other proposals. Therefore, 
the best fitted model agrees with the true model. 



Table 5: DIC, EAIC and EBIC for the fitted models with different specifications of the precision 
parameter (model 2); simulated dataset 



Model 


Precision Mixed Model 

(M&i)) 


DIC 


EAIC 


EBIC 


model 2a 


Si 


-1275.52 


-1406.92 


-1373.20 


model 2b 


Si + dn 


-1268.37 


-1399.64 


-1365.92 


model 2c 


Si + S 3 Xij 3 


-1274.96 


-1406.34 


-1372.62 


model 2d 


Si + S 2 Xij2 + SzXij 3 


-1273.85 


-1405.16 


-1371.44 


model 2e 


(Si + (In) + (<5 2 + d i2 )xij2 + S 3 Xij 3 


-1270.13 


-1401.42 


-1367.70 



Note: Models 2a-2e assumes the same location sub-model, namely logit(/iij) = (/3i +bn) + (^2 + ^>il)^ijl +^ij3/33- 



Table 6 reports the parameter estimates under model 2a. It can be seen that 
the estimates obtained through the Bayesian methodoly proposed here are similar 
to the corresponding true values of the parameters. To fit model 2a, we consid- 
ered 100, 000 Monte Carlo iterations and the estimates were obtained using the 
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last 90, 000 iterations. We obtained mprf = 1.00 < 1.2, indicating that the chain 
is convergent. Diagnostic plots (not shown) suggest that the chain for each pa- 
rameter is not correlated and stationary, respectively. Hence, our estimates are 
reliable. 

Table 6: True mean and estimated posterior medians and means, 95% credibility intervals ( CI) 
for model 2a; simulated data 



Parameter Posterior Estimation 





True 


Mean 


MC Error 


Median 


95% CI 


Pi 


-2.000 


-2.091 


0.003 


-2.091 


(-2.325,-1.852) 


h 


1.000 


1.073 


0.001 


1.073 


(0.905,1.242) 


Pz 


2.000 


1.999 


0.000 


1.999 


(1.872,2.128) 


Si 


3.892 


3.885 


0.000 


3.887 


(3.715,4.048) 


Vb 


10.000 


7.083 


0.059 


5.357 


(2.258,23.000) 




1.000 


0.882 


0.002 


0.866 


(0.497, 1.374) 




-0.300 


-0.180 


0.000 


-0.171 


(-0.392,-0.023) 


^f>22 


0.200 


0.239 


0.001 


0.228 


(0.078,0.457) 



5. A real data application 



We now consider the dataset reported by Prater (1956). The response vari- 
able is the proportion of crude oil converted into gasoline after distillation and 
fractionation. The dataset contains 32 observations on the response and on other 
variables. By sorting the data, it is clear that there are only 10 crudes involved. A 
potentially useful covariate is the end point (EP), i.e., the temperature (in degrees 



Fahrenheit) at which all gasoline has vaporized. Ferrari and Cribari-Neto (2004) 
fitted a beta regression model with constant precision to these data, in which the 
batches of crude oil are treated as a fixed factor with ten levels and with a fixed 



slope for the end point. Instead, Venables (2000) suggested that the batches should 
be viewed as a random factor. Graphical inspection of the data suggests that a lo- 
cation submodel with random intercepts and a common slope may be suitable for 
the data. 

At the outset, we consider a mixed beta regression model with a constant pre- 
cision parameter (model 1). The location submodel involves random intercepts 
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and a common slope. Table 7 gives the DIC, EAIC and EBIC for the model fit- 
ting with different prior specifications for the precision parameter 0. As before, 
for the parameters z/5, £&, and (3 we considered the prior distributions i/& ~ e(a), 
E 6 ~ IW 2 (^, c), and (3 ~ t 2 (vp, Hp, S^) with 



It can be noticed that the different proposed priors provide similar DIC, EAIC 
and EBIC values. The smallest EAIC and EBIC values are obtained by a beta 
prior with e = 0.1 and the smallest DIC is reached by a beta prior with e = 0.5. 



Table 7: DIC, EAIC and EBIC for the fitted models with different prior specifications of the preci- 
sion parameter under model 1; Prater's data 


Model 


Prior for <p 


DIC 


EAIC 


EBIC 


model 1.1 


0~ 7G(0.01,0.01) 


-141.485 


-138.853 


-128.593 


model 1.2 


= U 2 ,U~£/(O,5O) 


-141.228 


-139.598 


-129.338 


model 1.3 


(j)= (50S) 2 ,B~beta(l.l,l.l) 


-142.062 


-140.069 


-129.809 


model 1.4 


(j)= (50S) 2 ,5~beta(1.5,1.5) 


-142.025 


-140.120 


-129.860 



Note. Models 1.1-1.4 assumes the same location sub-model, namely logit(/ijj) = + + foEPij- 



We now assume that <fi is not constant through the observations. Again, the lo- 
cation submodel assumes random intercepts and a common slope. As in the sim- 
ulation study, we consider that both random effects, the 5;S and rf^s, are identically 
distributed with distribution t(u, 0, E) (so that v d = u b = v and E fe = E d = E 
in our previous notation). Table 8 gives the DIC, EIAC and EBIC values for the 
model fitting under different precision submodels (models 2.1-2.6). Note that 
model 2.1 is the same as model 1, i.e., it implies constant precision but with a dif- 
ferent prior for (p, namely ln(0) ~ t(10, 0, 5). Tables 9 and 10 give the posterior 
estimates of the parameters associated with models 1.4 and 2.5, which provide 
the best fits for constant and noncontant precision, respectively. Between the con- 
stant precision model (model 1.4) and the variable precision model (model 2.5), 
the DIC, EAIC and EBIC values suggest that the later is the best. It means that 
not only the location submodel but also the precision submodel are affected by a 
random additive effect and the end point (EP). Also, there is no evidence of as- 
sociation between the random effects since zero belongs to the credibility interval 





20 
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for £ 12 . It can also be noticed that the covariate EP affects both the mean and the 
precision of the proportion of crude oil converted into gasoline positively. 



Table 8: DIC, EAIC and EBIC for the fitted models with different specifications for the precision 
parameter under model 2; Prater's data 



Model 


Precision submodel 

(M0y)) 


DIC 


EAIC 


EBIC 


model 2.1 


Sx 


-141.382 


-139.006 


-128.746 


model 2.2 


Si + dn 


-140.324 


-138.970 


-128.716 


model 2.3 


5 2 EP ij 


-144.944 


-142.596 


-132.336 


model 2.4 


S x + S 2 EP ij 


-144.219 


-142.239 


-131.980 


model 2.5 


dix + S 2 EP ij 


-146.026 


-145.086 


-134.826 


model 2.6 


{S l + d il ) + S 2 EP ij 


-144.839 


-144.066 


-133.806 



Note: Models 2.1-2.6 assumes the same location sub-model, namely logit(/j;j ) = (/3i + bn) + fi^EPi 



Table 9: Estimated posterior medians and means, 95% credibility intervals ( CI) for the mixed beta 
regression model 1.4; Prater's data 



Parameter Posterior Inference 





Mean 


MC Error 


Median 


95% CI 




-5.116 


0.004 


-5.112 


(-5.631,-4.628) 


02 


10.730 xl(T 3 


0.533xl0- 3 


10.740 xl0~ 3 


(9.629xl0- 3 ,11.760xl0- 3 ) 


;/ 


12.99 


0.222 


9.334 


(1.098,45.72) 


<$> 


296.1 


1.387 


289.100 


(142.100,500.400) 


En 


0.204 


0.002 


0.175 


(0.041,0.519) 


El2 


0.464 xl0~ 3 


2.309 xl0~ 3 


1.337xl0- 3 


(-0.329,0.320) 


E22 


0.121 


0.005 


0.044 


(0.007,0.703) 



Some technical details relating to the fit of the models are now in order. We 
considered 200, 000 Monte Carlo iterations and our results were obtained consid- 
ering the last 190, 000 iterations. Additionally, we performed the diagnostic tests 
reported for the simulated data, all of which suggested suitable behavior of the 
chains. For model 1.4, the multivariate version of Gelman and Rubin's conver- 
gence diagnostic ( Brooks and Gelman] 1998) indicates that the chain is conver- 
gent (mprf = 1.09 < 1.2). Also, diagnostic plots (not shown) suggest that the 
chain for each parameter is not correlated and stationary, respectively, while Fig- 
ure [T] demonstrates that the posterior densitity function for each parameter does 
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Table 10: Estimated posterior medians and means, 95% credibility intervals (CI) for tlie mixed 
beta regression model 2.5; Prater's data 



Parameter Posterior Inference in location sub-model 



Mean MC Error Median 95% CI 

~fh -4.783 O008 -4.780 (-5.348, -4.233) 



>> 



9.892X10" 3 0.015X10" 3 9.898xl0" 3 (8.673x 10" 3 , 11.080X 10" 3 ) 

<5 2 17.150xl0~ 3 0.008xl0~ 3 17.190xl0~ 3 (14.770X 10" 3 , 19.330 x 10~ 3 ) 

v 13.190 0.066 9.333 (1.343, 47.01) 

En 0.179 0.000 0.158 (0.039, 0.444) 

S12 -0.930X10- 3 1.478xl0~ 3 0.358xl0~ 3 (-0.290, 0.283) 

S 22 0.108 0.002 0.041 (0.007, 0.627) 



not present multimodality; it should be noted that multimodality can be accompa- 
nied by convergence problems. We can then assume that the estimates reported 
in Table 9 are reliable. For model 2.5, similar diagnostic evidence was obtained. 
Here, mprf = 1.06 and diagnostic plots (not shown) and Figure [2] suggest that the 
results in Table 10 can be trusted. 



6. Discussion 



Beta regression modeling has gained increasing popularity after the work of 



Ferrari and Cribari-Neto (2004), who described a beta regression model param- 
eterized in terms of the mean response and a common precision parameter, and 
developed frequentist inference and basic diagnostic tools for the proposed model. 



A complementary approach proposed by Smithson and Verkuilen (2006) consid- 
ers that the precision parameter is not fixed but, instead, is modeled in a regression 



manner. A Bayesian beta regression model was studied by Branscum et al. (2007 ). 
In this paper, we extended these ideas for a mixed beta regression model under a 
Bayesian perspective. 

The present paper considered Bayesian inference for mixed beta regression 
based on two different approaches. First, the precision parameter was assumed 
to be fixed, i.e., the same for all observations. A linear regression structure was 
proposed for the mean parameter through a logit link function. Our results are 
readily extended to other link choices. Specification of different priors for the 
common precision parameter was studied. We considered a prior distribution for 
of the type <p = U 2 , with U ~ £7(0, a), where it is common to consider as ini- 
tial value a = 50 (Gelman 2006). We also proposed alternative priors, namely, 
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Figure 1: Density; Prater's data; constant <p; model 1.4 



prior distributions of the type = (aB) 2 , with B ~ beta(l + e, 1 + e) and 
e = 0.001, 0.01, 0.1, 0.5, . . ., which delivered good results in terms of model fit 
and performance of diagnostics tests. Second, the precision parameter was mod- 
eled through its own linear regression structure using a log link. Again, other 
choices of the precision link function can be accommodated. For both the mean 
and the precision submodels, a mixed-effects model with a multivariate t distri- 
bution for the fixed and the random effects was considered. Our empirical appli- 
cations yielded good results in terms of model fit and diagnostic tests. It is worth 
mentioning that in this context, it is necessary to perform a careful model selec- 
tion for the precision modeling including more or fewer fixed and random effects 
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Density of psiinv[1,1] 
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Figure 2: Density; Prater's data; non constant (f>; model 2.5 



since it is not clear in advance which model is more plausible. 

A classic version of this problem was raised by Zimp rich| (2010 ), where mixed 



beta regression models were estimated using the SAS procedure NLMIXED dSAS 



( |2008[ )), employing adaptive Gaussian quadrature. This approach achieves good 
results, but the implementation of the mixed beta regression model for random 
effects that are non-normally distributed is very challenging. In this sense, our 
approach is more flexible because one can easily implement it when the distri- 
bution of the random effects follow a normal, Student-t, skew normal or another 
distribution, by using simple and accessible software such as WinBUGS. Another 
advantage of this approach is the easy implementation for the imputation of miss- 
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ing data (Carrig an et al.[ |2007[ ), a common situation in practice and for which a 
classic approach is much more complicated. 



Appendix: BUGS codes for the mixed beta regression 

This appendix presents the various pieces of BUGS code used for fitting the 
mixed beta regression in the simulated data example. 

Inverse gamma prior for (f> 

model 

{ 

for ( i in 1 : m ) { 

for ( j in 1 : n ) { 

Y[i , j] " dbeta (al [i, j] ,a2[i,j]) 

al[i,j] <- mu[i , j]*phi 

a2[i,j] <- (l-mu[i , j])*phi 

logit(mu[i , j]) <- inprod(x[i, j, ], beta [ ] ) +inprod ( z [ i , j, ], b[i,l,]) 

} 

b[i,l,l:q ] " dmt (cerovec [ ] ,psi[ , ],gll) 

} 

gl 1 "dexp (aO ) 
beta[l:p] " dmt (alpha [ ] , VI [ , ],gl2) 
Vl[l:p ,l:p] <- inverse(V[ , ]) 
psi[l:q,l:q] " dwish (RO [ , ], cO) 
psiinv [ 1 : q, 1 : q] <-inverse (psi [ 1 :q, 1 : q] ) 
phiinv ~ dgamma (aOO, aOO) 
phi<-l/phiinv 

} 

Uniform prior for <j> 

model 

{ 

for ( i in 1 : m ) { 

for ( j in 1 : n ) { 

Y[i , j] " dbeta (al [i, j] ,a2[i,j]) 

al[i,j] <- mu[i , j]*phi 

a2[i,j] <- (l-mu[i , j])*phi 

logit(mu[i , j]) <- inprod(x[i, j, ], beta [ ] ) tinprod ( z [ i , j, ], b[i,l,]) 

} 

b[i,l,l:q ] ~ dmt (cerovec [ ] ,psi[ , ],gll) 
} 

gll "dexp (aO ) 

beta[l:p] " dmt (alpha [ ] , VI [ , ],gl2) 
Vl[l:p ,l:p] <- inverse(V[ , ]) 
psi[l:q,l:q] " dwish (RO [ , ], cO) 
psiinv [ 1 : q, 1 : q] <-inverse (psi[l:q,l:q]) 

phir ~ dunif (a00,b01) 
phi<- phir+phir 
} 

Beta prior for (j> 
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model 

{ 

for ( i in 1 : m ) { 

for ( j in 1 : n ) { 

Y[i , j] ~ dbeta (al [i, j] ,a2[i,j]) 

al[i,j] <_ mu[i , j]*phi 

a2[i,j] <- (l-mu[i , j])*phi 

logit(mu[i , j]) <- inprod(x[i, j, ], beta [ ] ) +inprod ( z [ i , j, ], b[i,l,]) 

} 

b[i,l,l:q ] " dmt (cerovec [ ] ,psi[ , ],gll) 

} 

gll ~dexp (aO ) 

beta[l:p] " dmt(alpha[ ] , VI [ , ],gl2) 
Vl[l:p ,l:p] <- inverse(V[ , ]) 
psi[l:q,l:q] " dwish (RO [ , ], cO) 
psiinv [ 1 : q, 1 : q] <- inverse (psi[l:q,l:q]) 
phiinicial ~ dbeta (aOO , bO ) 
phi<- (phiinicial*bl 1 ) * (phiinicial*bll) 
} 

Submodel for <j> 

model 
{ 

for ( i in 1 : m ) { 
for ( j in 1 : n ) { 

Y[i , j] ~ dbeta (al [i, j] ,a2[i,j]) 
al[i,j] <- mu[i , j]*phi[i,j] 
a2[i,j] <- (l-mu[i , j])*phi[i,j] 

log (phi [ i , j ] ) <-inprod (x [ i , j, ], delta [ ] ) linprod ( z [ i , j, ], gama[i,l,]) 
logit(mu[i , j]) <- inprod(x[i, j, ], beta [ ] ) tinprod ( z [ i , j, ], b[i,l,]) 

} 

b[i,l,l:q ] " dmt (cerovec [ ] ,psi[ , ],gll) 
gama [ i , 1 , 1 : q ] ~ dmt (cerovec [ ] ,psi[ , ],gll) 
} 

gll ~dexp (aO ) 

beta[l:p] " dmt(alpha[ ] , VI [ , ],gl2) 
delta [l:p] " dmt(alpha[ ] , VI [ , ],gl2) 
Vl[l:p ,l:p] <- inverse(V[ , ]) 
psi[l:q,l:q] " dwish (RO [ , ], cO) 
psiinv [ 1 : q, 1 : q] <- inverse (psi[l:q,l:q]> 
meanphi<-mean (phi [, ] ) 
} 
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